Henrike Barkmann Master Thesis 1
Abstract
The validation of software quality metrics lacks statistical significance. One reason is that data collection requires considerable effort. To help solve this problem, we develop tools for metrics analysis of a large number of software projects. Moreover, a validation of software quality metrics should focus on relevant metrics, i.e., correlated metrics need not be validated independently.

First, we extract the source code and associated metadata from the SVN and CVS repositories of SourceForge.NET projects. We developed SF Extract for collecting the metadata: it connects to SourceForge, reads the list of all available Java projects, follows the links to each project home page, parses the home page (HTML) to extract the relevant properties, and stores them in the database for further processing. Since the metadata also contains the URL of the SVN or CVS repository holding the source code, we could develop SVN Extract, which downloads the source code of these projects automatically and efficiently, based on the information in the database. The files and folders of the individual projects in the repository are stored in local working copies.

To ensure that the downloaded projects are complete and compilable, we import them manually into an Eclipse workspace as Java projects. So far, we could not find a way to automate this step in our process. The VizzAnalyzer metrics tool is fed with low-level information from the Eclipse project (syntax, cross-references, etc.) and computes the metrics. We implemented an export engine to store the computed metrics in a database for later processing. For the actual statistical analysis of the metrics, we use MS Excel and SPSS, both with access to the database. We selected 146 open-source Java projects at random, subject to practical constraints. Altogether, 32% of the 146 projects needed manual fixes to enable analysis.
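The download step described above (reading repository URLs from the database and fetching each project into a local working copy) can be illustrated with a minimal sketch. It is not the SVN Extract implementation: the `projects` table, its columns, the example URLs, and the function names are all hypothetical. The sketch only builds `svn checkout` commands and does not run them.

```python
import sqlite3


def build_checkout_command(repo_type, repo_url, target_dir):
    """Return the argument list for fetching one project (hypothetical helper)."""
    if repo_type == "svn":
        # `svn checkout URL PATH` creates a local working copy at PATH.
        return ["svn", "checkout", repo_url, target_dir]
    # CVS and other repository types are left out of this sketch.
    raise ValueError(f"unsupported repository type: {repo_type}")


def plan_checkouts(conn, workspace):
    """Read project records from the database and build one command per project."""
    rows = conn.execute(
        "SELECT name, repo_type, repo_url FROM projects ORDER BY name"
    ).fetchall()
    return [
        build_checkout_command(repo_type, repo_url, f"{workspace}/{name}")
        for name, repo_type, repo_url in rows
    ]


def demo_database():
    """Create an in-memory database with made-up project records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE projects (name TEXT, repo_type TEXT, repo_url TEXT)")
    conn.executemany(
        "INSERT INTO projects VALUES (?, ?, ?)",
        [
            ("alpha", "svn", "https://svn.example.org/alpha/trunk"),
            ("beta", "svn", "https://svn.example.org/beta/trunk"),
        ],
    )
    return conn


if __name__ == "__main__":
    for command in plan_checkouts(demo_database(), "workspace"):
        print(" ".join(command))
```

Keeping command construction separate from execution makes the plan easy to inspect before any network access happens; a real tool would pass each list to a process runner and record failures per project.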
The software metrics selected are a collection of the most popular metrics discussed in the literature. They are taken from different well-known metrics suites such as Chidamber & Kemerer [5]. For all pairs of metrics considered in this study, we aim at invalidating one of the following hypotheses:

H0: The pair of metrics values is independent in all software systems considered in the study.
H1: The pair of metrics values is dependent, that is, it shows a statistically significant correlation between the measured values in all software systems considered in the study.

Our first contribution provides tool support for collecting large amounts of quantitative data on open-source software systems written in Java. Our second contribution reduces the number of metrics to validate. We could show correlation among the individual metrics, indicating that some of them seem to measure the same properties. The correlation coefficients show strong connections between 5 metric pairs, with values greater than or equal to 0.90. Four of the six metrics involved can be excluded, since they are redundant. Based on these findings, we can reject our hypothesis H0 for some pairs of metrics and thus support our alternative hypothesis H1. Finally, we describe metric values statistically, getting a first overview of the absolute value ranges for some well-known metrics.

URLs referenced: http://sourceforge.net, http://www.arisa.se, http://www.spss.com
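The redundancy check on metric pairs can be sketched as follows. The abstract does not say which correlation coefficient was used, so the Pearson coefficient here is an assumption, as is comparing its absolute value against the 0.90 threshold. The metric names and values are made up for illustration and are not data from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def strongly_correlated_pairs(metrics, threshold=0.90):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    pairs = []
    for name_a, name_b in combinations(sorted(metrics), 2):
        r = pearson(metrics[name_a], metrics[name_b])
        if abs(r) >= threshold:
            pairs.append((name_a, name_b, r))
    return pairs


# Hypothetical metric values, one entry per measured system.
SAMPLE_METRICS = {
    "metric_a": [1, 2, 3, 4, 5],
    "metric_b": [2, 4, 6, 8, 10],  # exactly twice metric_a
    "metric_c": [5, 1, 4, 2, 3],
}

if __name__ == "__main__":
    for name_a, name_b, r in strongly_correlated_pairs(SAMPLE_METRICS):
        print(f"{name_a} ~ {name_b}: r = {r:.2f}")
```

In this toy data only `metric_a` and `metric_b` cross the threshold, so one of the two could be dropped from further validation; a statistical package such as SPSS would additionally report significance levels, which this sketch omits.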
Publication date: 2009